Search results

1 – 2 of 2
Article
Publication date: 27 August 2014

Paolo Manghi, Michele Artini, Claudio Atzori, Alessia Bardi, Andrea Mannocci, Sandro La Bruzzo, Leonardo Candela, Donatella Castelli and Pasquale Pagano

The purpose of this paper is to present the architectural principles and the services of the D-NET software toolkit. D-NET is a framework where designers and developers find the…

Abstract

Purpose

The purpose of this paper is to present the architectural principles and the services of the D-NET software toolkit. D-NET is a framework where designers and developers find the tools for constructing and operating aggregative infrastructures (systems for aggregating data sources with heterogeneous data models and technologies) in a cost-effective way. Designers and developers can select from a variety of D-NET data management services, can configure them to handle data according to given data models, and can construct autonomic workflows to obtain personalized aggregative infrastructures.

Design/methodology/approach

The paper provides a definition of aggregative infrastructures, sketching architecture, and components, as inspired by real-case examples. It then describes the limits of current solutions, which find their lacks in the realization and maintenance costs of such complex software. Finally, it proposes D-NET as an optimal solution for designers and developers willing to realize aggregative infrastructures. The D-NET architecture and services are presented, drawing a parallel with the ones of aggregative infrastructures. Finally, real-cases of D-NET are presented, to show-case the statement above.

Findings

The D-NET software toolkit is a general-purpose service-oriented framework where designers can construct customized, robust, scalable, autonomic aggregative infrastructures in a cost-effective way. D-NET is today adopted by several EC projects, national consortia and communities to create customized infrastructures under diverse application domains, and other organizations are enquiring for or are experimenting its adoption. Its customizability and extendibility make D-NET a suitable candidate for creating aggregative infrastructures mediating between different scientific domains and therefore supporting multi-disciplinary research.

Originality/value

D-NET is the first general-purpose framework of this kind. Other solutions are available in the literature but focus on specific use-cases and therefore suffer from the limited re-use in different contexts. Due to its maturity, D-NET can also be used by third-party organizations, not necessarily involved in the software design and maintenance.

Details

Program, vol. 48 no. 4
Type: Research Article
ISSN: 0033-0337

Keywords

Open Access
Article
Publication date: 29 June 2020

Paolo Manghi, Claudio Atzori, Michele De Bonis and Alessia Bardi

Several online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate…

4540

Abstract

Purpose

Several online services offer functionalities to access information from “big research graphs” (e.g. Google Scholar, OpenAIRE, Microsoft Academic Graph), which correlate scholarly/scientific communication entities such as publications, authors, datasets, organizations, projects, funders, etc. Depending on the target users, access can vary from search and browse content to the consumption of statistics for monitoring and provision of feedback. Such graphs are populated over time as aggregations of multiple sources and therefore suffer from major entity-duplication problems. Although deduplication of graphs is a known and actual problem, existing solutions are dedicated to specific scenarios, operate on flat collections, local topology-drive challenges and cannot therefore be re-used in other contexts.

Design/methodology/approach

This work presents GDup, an integrated, scalable, general-purpose system that can be customized to address deduplication over arbitrary large information graphs. The paper presents its high-level architecture, its implementation as a service used within the OpenAIRE infrastructure system and reports numbers of real-case experiments.

Findings

GDup provides the functionalities required to deliver a fully-fledged entity deduplication workflow over a generic input graph. The system offers out-of-the-box Ground Truth management, acquisition of feedback from data curators and algorithms for identifying and merging duplicates, to obtain an output disambiguated graph.

Originality/value

To our knowledge GDup is the only system in the literature that offers an integrated and general-purpose solution for the deduplication graphs, while targeting big data scalability issues. GDup is today one of the key modules of the OpenAIRE infrastructure production system, which monitors Open Science trends on behalf of the European Commission, National funders and institutions.

Details

Data Technologies and Applications, vol. 54 no. 4
Type: Research Article
ISSN: 2514-9288

Keywords

1 – 2 of 2